Solvers in Python
Daniel P. Palomar (2025). Portfolio Optimization: Theory and Application. Cambridge University Press.
Last update: February 7, 2026
Contributors:
This Python session will introduce different solvers in Python that can be used for portfolio optimization.
General solvers¶
General solvers can deal with arbitrary nonlinear optimization problems, at the expense of perhaps slow convergence and risk of numerical issues.
Package scipy¶
The scipy package offers several general purpose optimization routines
via the optimize module.
minimize_scalar(): for one-dimensional unconstrained function optimization over an interval (for one-dimensional root finding useroot_scalar())minimize(): for general-purpose optimization.
import numpy as np
from scipy import optimize as opt
# Define a function
def foo(x):
return np.exp(-.5 * x) * np.sin(10 * np.pi * x)
foo(0.5)
np.float64(4.768779430809241e-16)
res = opt.minimize_scalar(foo, bounds = [0, 1], method = "Bounded")
res
message: Solution found.
success: True
status: 0
fun: -0.8395633414865528
x: 0.34949347963438415
nit: 11
nfev: 11
# We can do it via a lambda (instead of a function definition):
res = opt.minimize_scalar(fun = lambda x: np.exp(-.5 * x) * np.sin(10 * np.pi * x),
bounds = [0, 1], method = "Bounded")
res
message: Solution found.
success: True
status: 0
fun: -0.8395633414865528
x: 0.34949347963438415
nit: 11
nfev: 11
# Plot using matplotlib
import matplotlib.pyplot as plt
x = np.linspace(0, 1, 1000)
plt.plot(x, foo(x))
plt.plot(res.x, res.fun, 'ro') # Plotting the point as before
plt.show()
# Plot using seaborn
import seaborn as sns
sns.set(style="darkgrid")
x = np.linspace(0, 1, 1000)
sns.lineplot(x=x, y=foo(x))
sns.scatterplot(x=[res.x], y=[res.fun], color="red", s=100)
<Axes: >
minimize(): General-purpose optimization function with 14 different optimization methods (see documentation):Nelder-Mead: uses the Simplex algorithm. This algorithm is robust in many applications. However, if numerical computation of derivative can be trusted, other algorithms using the and/or second derivatives information might be preferred for their better performance in general.Powell: very fast, derivative-free implementation of Powell's methodCG: uses a nonlinear conjugate gradient algorithm by Polak and Ribiere, a variant of the Fletcher-Reeves method. Only the first derivatives are used.BFGS: uses the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS). It uses the first derivatives only. BFGS has proven good performance even for non-smooth optimizations. This method also returns an approximation of the Hessian inverse.L-BFGS-B: uses the L-BFGS-B algorithm for bound constrained minimization.TNC: uses a truncated Newton algorithm to minimize a function with variables subject to bounds. This algorithm uses gradient information; it is also called Newton Conjugate-Gradient. It differs from the Newton-CG method described above as it wraps a C implementation and allows each variable to be given upper and lower bounds.Newton-CG: uses a Newton-CG algorithm (also known as the truncated Newton method). It uses a CG method to the compute the search direction.COBYLA: uses the Constrained Optimization BY Linear Approximation (COBYLA) method. The algorithm is based on linear approximations to the objective function and each constraint. The method wraps a FORTRAN implementation of the algorithm. The constraints functions ‘fun’ may return either a single number or an array or list of numbers.SLSQP: uses Sequential Least SQuares Programming to minimize a function of several variables with any combination of bounds, equality and inequality constraints. The method wraps the SLSQP Optimization subroutine originally implemented by Dieter Kraft. Note that the wrapper handles infinite values in bounds by converting them into large floating values.trust-constr: is a trust-region algorithm for constrained optimization. It swiches between two implementations depending on the problem definition. It is the most versatile constrained minimization algorithm implemented in SciPy and the most appropriate for large-scale problems. For equality constrained problems it is an implementation of Byrd-Omojokun Trust-Region SQP method.trust-ncg: uses the Newton conjugate gradient trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems.trust-krylov: uses the Newton GLTR trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems. On indefinite problems it requires usually less iterations than the trust-ncg method and is recommended for medium and large-scale problems.trust-exact: is a trust-region method for unconstrained minimization in which quadratic subproblems are solved almost exactly. This algorithm requires the gradient and the Hessian (which is not required to be positive definite). It is, in many situations, the Newton method to converge in fewer iteraction and the most recommended for small and medium-size problems.dogleg: uses the dog-leg trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and Hessian; furthermore the Hessian is required to be positive definite.
Examples¶
This example does a least squares fitting: $$ \begin{array}{ll} \underset{a,b}{\textsf{minimize}} & \sum\limits_{i=1}^{m} \left(y_i - (a+bx_i)\right)^2 \end{array} $$
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
# data points to be fitted
x = 1 + np.arange(6)
y = np.array([1, 3, 5, 6, 8, 12])
# l2-norm error function
def obj_fun(param):
return np.sum((y - (param[0] + param[1] * x))**2)
# call solver with initial value [0, 1]
res = minimize(obj_fun, x0 = np.array([0, 1]), method = 'Nelder-Mead')
print(res)
message: Optimization terminated successfully.
success: True
status: 0
fun: 2.819047623875957
x: [-1.267e+00 2.029e+00]
nit: 64
nfev: 126
final_simplex: (array([[-1.267e+00, 2.029e+00],
[-1.267e+00, 2.029e+00],
[-1.267e+00, 2.029e+00]]), array([ 2.819e+00, 2.819e+00, 2.819e+00]))
plt.plot(x, y, 'ko') # plot linear regression
plt.plot(x, res.x[0] + res.x[1] * x, 'k-')
[<matplotlib.lines.Line2D at 0x11ba7ad50>]
The next example illustrates the use of the gradient with the famous Rosenbrock banana function: $$ f(\mathbf{x}) = 100(x_2-x_1^2)^2 + (1-x_1)^2 $$ with gradient $$ \nabla f(\mathbf{x}) = \left[ \begin{array}{c} -400x_1(x_2-x_1^2) -2(1-x_1)\\ 200(x_2-x_1^2) \end{array}\right] $$ with the unconstrained minimization problem $$ \begin{array}{ll} \underset{\mathbf{x}\in \mathbb{R}^2}{\textsf{minimize}} & 100(x_2-x_1^2)^2 + (1-x_1)^2 \end{array} $$
# Rosenbrock banana function and its gradient
def f_banana(x):
return(100 * (x[1] - x[0] * x[0])**2 + (1 - x[0])**2)
def gr_banana(x):
return(np.array([[-400 * x[0] * (x[1] - x[0] * x[0]) - 2 * (1 - x[0])],
[200 * (x[1] - x[0] * x[0])]]).reshape((-1))) # reshape((-1)) converts the column matrix into a vector
res = minimize(f_banana, np.array([-1.2, 1]), method='Nelder-Mead')
res
message: Optimization terminated successfully.
success: True
status: 0
fun: 8.177661197416674e-10
x: [ 1.000e+00 1.000e+00]
nit: 85
nfev: 159
final_simplex: (array([[ 1.000e+00, 1.000e+00],
[ 1.000e+00, 1.000e+00],
[ 1.000e+00, 1.000e+00]]), array([ 8.178e-10, 1.108e-09, 1.123e-09]))
res = minimize(f_banana, np.array([-1.2, 1]), method='BFGS', jac = gr_banana)
res
message: Optimization terminated successfully.
success: True
status: 0
fun: 2.5353073555215442e-15
x: [ 1.000e+00 1.000e+00]
nit: 32
jac: [-1.705e-06 8.233e-07]
hess_inv: [[ 5.019e-01 1.003e+00]
[ 1.003e+00 2.010e+00]]
nfev: 39
njev: 39
This example uses the simulated annealing method (for global optimization):
from scipy import optimize as opt
# "wild" function , global minimum at about -15.81515
def f_wild(x):
return(10*np.sin(0.3*x) * np.sin(1.3 * x**2) + 0.00001 * x**4 + 0.2 * x + 80)
t = np.linspace(-50, 50, 1000)
res = opt.basinhopping(f_wild, 50, niter = 500)
res
message: ['requested number of basinhopping iterations completed successfully']
success: True
fun: 68.94033698598757
x: [-5.496e+00]
nit: 500
minimization_failures: 58
nfev: 10868
njev: 5225
lowest_optimization_result: message: Optimization terminated successfully.
success: True
status: 0
fun: 68.94033698598757
x: [-5.496e+00]
nit: 5
jac: [ 2.861e-06]
hess_inv: [[ 4.909e-04]]
nfev: 12
njev: 6
res = minimize(f_wild, res.x, method='BFGS') # now improve locally (typically only by a small bit):
res
message: Optimization terminated successfully.
success: True
status: 0
fun: 68.94033698598757
x: [-5.496e+00]
nit: 0
jac: [ 2.861e-06]
hess_inv: [[1]]
nfev: 2
njev: 1
plt.plot(t, f_wild(t), 'k-')
plt.plot(res.x, f_wild(res.x), 'ro')
[<matplotlib.lines.Line2D at 0x11bc451f0>]
Note that some methods accept bounds on the optimization variable via the argument bounds.
However, make sure to verify in the documentation
whether the particular method accepts bounds or not.
Solvers for specific classes of problems¶
If the problem to be solved falls within a particular class of problems, such as LS, LP, MILP, QP, SOCP, or SDP, then it is much better to use a solver specific for that class.
Least Squares (LS)¶
A linear least-squares (LS) problem minimizes the $\ell_2$-norm $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_2$ possibly
with bounds or linear constraints. The function scipy.optimize.lsq_linear from scipy
solves over- and underdetermined linear systems in the least-squares sense.
Linear Programming (LP)¶
The package scipy contains the function scipy.optimize.linprog (documentation) to conveniently solve LPs of the form (it can also contain equality constraints):
$$
\begin{array}{ll}
\underset{\mathbf{x}\in \mathbb{R}^n}{\textsf{minimize}} & \mathbf{c}^T\mathbf{x}\\
{\textsf{subject to}} & \begin{array}[t]{l} \mathbf{A}\mathbf{x} \le \mathbf{b}\\
\mathbf{l} \le \mathbf{x} \le \mathbf{u}.\end{array}\\
\end{array}
$$
Let's consider the following LP: $$ \begin{array}{ll} \underset{x_0, x_1}{\textsf{minimize}} & 4 x_1 - x_0\\ {\textsf{subject to}} & \begin{array}[t]{l} x_1 - 3 x_0 \leq 6\\ x_0 + 2x_1 \leq 4\\ x_1 \geq -3 \\ -\infty \leq x_0 \leq \infty \end{array} \end{array} $$
from scipy.optimize import linprog
c = [-1, 4]
A = [[-3, 1], [1, 2]]
b = [6, 4]
x0_bounds = (None, None)
x1_bounds = (-3, None)
res = linprog(c, A_ub=A, b_ub=b, bounds=(x0_bounds, x1_bounds), options={"disp": True})
res
Running HiGHS 1.8.0 (git hash: 222cce7): Copyright (c) 2024 HiGHS under MIT licence terms Coefficient ranges: Matrix [1e+00, 3e+00] Cost [1e+00, 4e+00] Bound [3e+00, 3e+00] RHS [4e+00, 6e+00] Presolving model 0 rows, 0 cols, 0 nonzeros 0s 0 rows, 0 cols, 0 nonzeros 0s Presolve : Reductions: rows 0(-2); columns 0(-2); elements 0(-4) - Reduced to empty Solving the original LP from the solution after postsolve Model status : Optimal Objective value : -2.2000000000e+01 Relative P-D gap : 0.0000000000e+00 HiGHS run time : 0.00
message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
success: True
status: 0
fun: -22.0
x: [ 1.000e+01 -3.000e+00]
nit: 0
lower: residual: [ inf 0.000e+00]
marginals: [ 0.000e+00 6.000e+00]
upper: residual: [ inf inf]
marginals: [ 0.000e+00 0.000e+00]
eqlin: residual: []
marginals: []
ineqlin: residual: [ 3.900e+01 0.000e+00]
marginals: [-0.000e+00 -1.000e+00]
mip_node_count: 0
mip_dual_bound: 0.0
mip_gap: 0.0
Quadratic Programming (QP)¶
The package quadprog
contains the function solve_qp to conveniently solve QPs of the form:
$$
\begin{array}{ll}
\underset{\mathbf{x}\in \mathbb{R}^n}{\textsf{minimize}} &
-\mathbf{a}^T\mathbf{x} + \frac{1}{2}\mathbf{x}^T\mathbf{G}\mathbf{x}\\
{\textsf{subject to}} & \mathbf{C}^T\mathbf{x} \geq \mathbf{b}\\
\end{array}
$$
from quadprog import solve_qp # pip install quadprog
import numpy as np
# Set up problem:
# minimize -(0 5 0) x + 1/2 x^T x
# subject to A^T x >= b
# with b = (-8,2,0)^T
# (-4 2 0)
# A = (-3 1 -2)
# ( 0 0 1)
Gmat = np.eye(3)
avec = np.array([0., 5., 0.])
Cmat = np.array([[-4., -3., 0.],
[2., 1., 0.],
[0., -2., 1.]])
bvec = np.array([-8., 2., 0.])
# run solver
res = solve_qp(Gmat, avec, Cmat, bvec)
res
(array([0., 5., 0.]), -12.5, array([0., 5., 0.]), array([1, 0], dtype=int32), array([0., 0., 0.]), array([], dtype=int32))
The package osqp offers a object-oriented way to solve QPs written in the following form: $$ \begin{array}{ll} \underset{\mathbf{x}}{\mbox{minimize}} & \frac{1}{2} \mathbf{x}^T \mathbf{P} \mathbf{x} + \mathbf{q}^T \mathbf{x} \\ \mbox{subject to} & \mathbf{l} \leq \mathbf{A} \mathbf{x} \leq \mathbf{u} \end{array} $$
import osqp # pip install osqp
import numpy as np
from scipy import sparse
# Generate problem data
np.random.seed(1)
m = 30
n = 20
Ad = sparse.random(m, n, density=0.7, format='csc')
b = np.random.randn(m)
# OSQP data
P = sparse.block_diag([sparse.csc_matrix((n, n)), sparse.eye(m)], format='csc')
q = np.zeros(n+m)
A = sparse.vstack([
sparse.hstack([Ad, -sparse.eye(m)]),
sparse.hstack([sparse.eye(n), sparse.csc_matrix((n, m))])], format='csc')
l = np.hstack([b, np.zeros(n)])
u = np.hstack([b, np.ones(n)])
# Create an OSQP object
prob = osqp.OSQP()
# Setup workspace
prob.setup(P, q, A, l, u)
res = prob.solve() # Solve problem
print(f"Status: {res.x[:5]}...") # Show first 5 elements
print(f"Optimal value: {res}")
-----------------------------------------------------------------
OSQP v1.0.0 - Operator Splitting QP Solver
(c) The OSQP Developer Team
-----------------------------------------------------------------
problem: variables n = 50, constraints m = 50
nnz(P) + nnz(A) = 500
settings: algebra = Built-in,
OSQPInt = 4 bytes, OSQPFloat = 8 bytes,
linear system solver = QDLDL v0.1.8,
eps_abs = 1.0e-03, eps_rel = 1.0e-03,
eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
rho = 1.00e-01 (adaptive: 50 iterations),
sigma = 1.00e-06, alpha = 1.60, max_iter = 4000
check_termination: on (interval 25, duality gap: on),
time_limit: 1.00e+10 sec,
scaling: on (10 iterations), scaled_termination: off
warm starting: on, polishing: off,
iter objective prim res dual res gap rel kkt rho time
1 0.0000e+00 2.50e+00 4.47e+02 -3.58e+03 4.47e+02 1.00e-01 8.65e-05s
50 1.4079e+01 1.49e-01 2.73e-04 -1.69e+00 1.49e-01 2.39e+00* 3.32e-04s
75 1.6487e+01 4.25e-04 2.63e-04 -7.88e-03 4.25e-04 2.39e+00 3.80e-04s
status: solved
number of iterations: 75
optimal objective: 16.4870
dual objective: 16.4949
duality gap: -7.8795e-03
primal-dual integral: 3.5820e+03
run time: 3.93e-04s
optimal rho estimate: 3.25e+00
Status: [ 1.16271794e-01 -1.61643557e-04 1.70167161e-01 -3.94214608e-04
-1.82469015e-04]...
Optimal value: namespace(x=array([ 1.16271794e-01, -1.61643557e-04, 1.70167161e-01, -3.94214608e-04,
-1.82469015e-04, -6.65480745e-05, -1.17548212e-04, -1.86604664e-04,
-2.37768087e-04, -4.24806828e-04, 7.55935663e-02, -1.68698146e-04,
-2.57139565e-04, -2.24347890e-04, -2.01812958e-04, 2.41442430e-05,
-2.03870128e-04, 3.46764616e-01, -8.07295133e-05, -2.90484448e-04,
3.29388604e-02, -2.13451312e-01, -1.57046657e+00, 1.35985297e+00,
6.16781533e-02, 1.23322122e+00, -3.96582518e-01, 9.86134562e-01,
-7.43102945e-01, -1.19476661e+00, 2.32567427e-01, 1.14871954e+00,
1.23942885e+00, -8.31436652e-03, -4.92103767e-01, -1.67001256e+00,
2.85133392e+00, 1.12197044e+00, -5.97523530e-01, 5.32090457e-03,
-2.56936966e-01, -1.17695555e+00, 4.29516710e-01, 1.74325657e+00,
-1.30655245e-01, 8.55495862e-01, 4.86696240e-02, 4.88486562e-01,
1.29894245e+00, -9.68940171e-01]), y=array([ 3.29388604e-02, -2.13451312e-01, -1.57046657e+00, 1.35985297e+00,
6.16781534e-02, 1.23322122e+00, -3.96582518e-01, 9.86134562e-01,
-7.43102945e-01, -1.19476661e+00, 2.32567427e-01, 1.14871954e+00,
1.23942885e+00, -8.31436653e-03, -4.92103767e-01, -1.67001256e+00,
2.85133392e+00, 1.12197044e+00, -5.97523530e-01, 5.32090469e-03,
-2.56936966e-01, -1.17695555e+00, 4.29516710e-01, 1.74325657e+00,
-1.30655245e-01, 8.55495862e-01, 4.86696240e-02, 4.88486562e-01,
1.29894245e+00, -9.68940172e-01, 0.00000000e+00, -2.09016273e+00,
0.00000000e+00, -2.82520974e+00, -4.01571752e+00, -1.46825063e+00,
-1.77059921e+00, -4.09291956e+00, -2.34218193e+00, -2.51005536e+00,
6.93889390e-19, -4.09335448e+00, -1.89704203e+00, -2.66674720e+00,
-2.85187950e+00, -3.66497675e-02, -3.02740080e-01, 0.00000000e+00,
-1.43639144e+00, -2.18811320e+00]), prim_inf_cert=array([2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09]), dual_inf_cert=array([2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09, 2.14328934e+09, 2.14328934e+09,
2.14328934e+09, 2.14328934e+09]), info=namespace(_pybind11_conduit_v1_=<bound method pybind11_detail_function_record_v1_system_libcpp_abi1._pybind11_conduit_v1_ of <osqp.ext_builtin.OSQPInfo object at 0x11bf40c70>>, status='solved', status_val=1, status_polish=0, obj_val=16.4869734917924, dual_obj_val=16.494853011581807, prim_res=0.00042480682772998775, dual_res=0.0002628840560558355, duality_gap=-0.007879519789405265, iter=75, rho_updates=1, rho_estimate=3.247080303250078, setup_time=6.9958e-05, solve_time=0.000323291, update_time=0.0, polish_time=0.0, run_time=0.00039324900000000003, primdual_int=3581.9988680820356, rel_kkt_error=0.00042480682772998775))
Comparison: quadprog vs OSQP¶
- quadprog: Simpler interface, good for small-to-medium problems
- OSQP: More flexible, handles larger problems, supports warm-starting
- Performance: OSQP typically faster for problems with > 1000 variables
import time
# Compare quadprog vs OSQP
start = time.time()
res1 = solve_qp(Gmat, avec, Cmat, bvec)
time_quadprog = time.time() - start
start = time.time()
res2 = prob.solve()
time_osqp = time.time() - start
print(f"quadprog: {time_quadprog:.4f}s")
print(f"OSQP: {time_osqp:.4f}s")
iter objective prim res dual res gap rel kkt rho time 1 1.6489e+01 3.39e-04 2.10e-04 -6.29e-03 3.39e-04 2.39e+00 2.79e-05s 25 1.6495e+01 1.52e-06 9.42e-07 -2.82e-05 1.52e-06 2.39e+00 1.21e-04s status: solved number of iterations: 25 optimal objective: 16.4950 dual objective: 16.4950 duality gap: -2.8229e-05 primal-dual integral: 3.5820e+03 run time: 1.87e-04s optimal rho estimate: 3.25e+00 quadprog: 0.0000s OSQP: 0.0003s
Second-Order Cone Programming (SOCP)¶
There are several packages:
- Package ecos-python provides an interface to the Embedded COnic Solver (ECOS), a well-known, efficient, and robust C library for convex problems. Conic and equality constraints can be specified in addition to integer and boolean variable constraints for mixed-integer problems.
Cone Programming and Semi-Definite Programming (SDP)¶
There are several good packages:
Package Irene Irene was originally written to find reliable approximations for optimum value of an arbitrary optimization problem. It implements a modification of Lasserre’s SDP Relaxations based on generalized truncated moment problem to handle general optimization problems algebraically.
Package mosek contains a Python interface to the (commercial) MOSEK optimization library for large-scale LP, QP, and MIP problems, with emphasis on (nonlinear) conic, semidefinite, and convex tasks; academic licenses are available.
Package scs is a numerical optimization package for solving large-scale convex cone problems, based on the paper Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding.
Mixed Integer Programming¶
- Package mip is a collection of Python tools for the modeling and solution of Mixed-Integer Linear programs (MIPs). It provides access to advanced solver features like cut generation, lazy constraints, MIP starts and solution pools.
Modeling Language for Convex Optimization (CVXPY)¶
CVXPY is a Python package that provides an object-oriented modeling language for convex optimization, similar to CVX, CVXR, YALMIP, and Convex.jl. It allows the user to formulate convex optimization problems in a natural mathematical syntax rather than the restrictive standard form required by most solvers. The user specifies an objective and set of constraints by combining constants, variables, and parameters using a library of functions with known mathematical properties. CVXR then applies signed disciplined convex programming (DCP) to verify the problem’s convexity. Once verified, the problem is converted into standard conic form using graph implementations and passed to a cone solver such as ECOS or SCS. See CVX tutorials for details. Let's see now several examples.
Least squares
Let's start with a simple LS example: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2\\ \end{array} $$
Let's first solve it via Scipy lsq_linear:
import numpy as np
from scipy.optimize import lsq_linear
# generate data
m = 100
n = 10
beta_true = np.arange(10) - 4
X = np.random.normal(size=n*m).reshape((m,-1))
y = X @ beta_true + np.random.normal(size=m)
# estimate x
res = lsq_linear(X, y)
res
message: The unconstrained solution is optimal.
success: True
status: 3
fun: [-1.111e+00 1.494e+00 ... 2.154e-01 -6.792e-01]
x: [-4.016e+00 -3.009e+00 -2.134e+00 -6.944e-01 -5.416e-02
7.346e-01 1.870e+00 3.207e+00 3.844e+00 5.001e+00]
nit: 0
cost: 50.55498554873579
optimality: 3.552713678800501e-13
active_mask: [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00]
unbounded_sol: (array([-4.016e+00, -3.009e+00, -2.134e+00, -6.944e-01,
-5.416e-02, 7.346e-01, 1.870e+00, 3.207e+00,
3.844e+00, 5.001e+00]), array([ 1.011e+02]), np.int32(10), array([ 1.224e+01, 1.170e+01, 1.101e+01, 1.035e+01,
1.006e+01, 9.406e+00, 9.177e+00, 9.033e+00,
7.789e+00, 7.247e+00]))
Now let's do it with CVXPY:
import cvxpy as cp # pip install cvxpy
import numpy as np
# define variable
beta = cp.Variable(n)
# define objective
objective = cp.Minimize(cp.sum_squares(X @ beta - y))
# create problem
prob = cp.Problem(objective)
# solve it!
result = prob.solve()
beta.value
array([-4.01608366, -3.00908591, -2.13356382, -0.69438355, -0.05416262,
0.73455804, 1.86975671, 3.20690015, 3.84395209, 5.00103773])
We can now easily add a constraint to solve a nonnegative LS:
constraints = [beta >= 0]
prob = cp.Problem(objective, constraints)
result = prob.solve()
beta.value
array([-5.26257863e-17, -5.71912473e-17, -7.18335878e-17, -1.11541124e-16,
4.05790128e-01, 1.60249957e+00, 1.93370680e+00, 2.53939243e+00,
3.92161170e+00, 5.76468454e+00])
We can add other constraints:
constraint1 = beta[2] + beta[3] <= 0
A = np.diag(np.concatenate([np.array([1, 0, 0]), np.array([1] * 7)]))
print(A)
[[1 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 1 0 0 0 0 0 0] [0 0 0 0 1 0 0 0 0 0] [0 0 0 0 0 1 0 0 0 0] [0 0 0 0 0 0 1 0 0 0] [0 0 0 0 0 0 0 1 0 0] [0 0 0 0 0 0 0 0 1 0] [0 0 0 0 0 0 0 0 0 1]]
constraint2 = A @ beta >= 0
# define problem and solve it
prob = cp.Problem(objective, constraints = [constraint1, constraint2])
result = prob.solve()
beta.value
array([-8.12587069e-17, -2.55930726e+00, -2.38179431e+00, -7.56294003e-18,
5.69530272e-01, 1.56146521e+00, 1.83898696e+00, 2.49285673e+00,
3.68068355e+00, 5.85268505e+00])
Robust Huber regression
Let's consider another simple example of robust regression: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \sum\limits_{i=1}^m \phi(y_i - \mathbf{x}_i^T\boldsymbol{\beta}) \end{array} $$ where $$ \phi(u) = \begin{cases} u^2 & \text{if }|u| \le M \\ 2Mu-M^2 & \text{if }|u| > M \end{cases} $$
M = 0.1
beta = cp.Variable(n)
obj = cp.sum(cp.huber(y - X @ beta, M))
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.17042853, -3.04044983, -2.10513397, -0.71206119, -0.01068557,
0.76251828, 1.8334574 , 3.23198934, 3.77406989, 4.90310484])
Elastic net regularization
We will now solve the problem: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \frac{1}{2m}\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda (1-\alpha)\|\boldsymbol{\beta}\|_2^2 + \alpha\| \boldsymbol{\beta} \|_1 \end{array} $$
# define the regularization term
def elastic_reg(beta, lmd = 0, alpha = 0):
ridge = (1 - alpha) * cp.sum(beta ** 2)
lasso = alpha * cp.norm(beta, 1)
return lmd * (lasso + ridge)
# define problem and solve it
alpha = 0.5
lmd = 0.5
beta = cp.Variable(n)
obj = cp.sum((y - X @ beta)**2) + elastic_reg(beta, lmd, alpha)
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.00380815, -2.99831083, -2.12711484, -0.69208925, -0.05285668,
0.73604469, 1.86351502, 3.19408019, 3.83259749, 4.9906979 ])
Sparse inverse covariance estimation
Note on Solvers: This problem requires an SDP solver. CVXPY will automatically select:
- SCS (if installed:
pip install scs) - CVXOPT (if installed:
pip install cvxopt) - Or fail if no suitable solver is available
Consider the matrix-valued convex problem: $$ \begin{array}{ll} \underset{\mathbf{X}}{\textsf{maximize}} & \textsf{logdet}(\mathbf{X}) - \textsf{Tr}(\mathbf{X}\mathbf{S})\\ \textsf{subject to} & \mathbf{X} \succeq \mathbf{0}, \qquad \sum_{i=1}^n\sum_{j=1}^n\left|X_{ij}\right| \le \alpha \end{array} $$
n = 3
np.random.seed(1)
S = np.random.randn(n, n)
S = (S + S.T) / 2 # Make S symmetric
X = cp.Variable((n, n), PSD=True)
obj = cp.log_det(X) - cp.trace(X @ S)
alpha = 1
constr = [cp.sum(cp.abs(X)) <= alpha]
result = cp.Problem(cp.Maximize(obj), constr).solve(solver=cp.SCS)
X.value
array([[ 2.64619429e-01, 1.06538048e-07, -3.78401286e-08],
[ 1.06538048e-07, 3.31118026e-01, 8.97513152e-08],
[-3.78401286e-08, 8.97513152e-08, 4.04255112e-01]])
np.linalg.eig(X.value)[0] # compute the eigenvalues of X to verify it is PD
array([0.26461943, 0.33111803, 0.40425511])
When specifying a solver, we should make sure that it is installed. For example, to use SCS:
# Check available solvers
print("Available solvers:", cp.installed_solvers())
Portfolio Optimization¶
Consider the following mean-variance portfolio optimization problem: $$ \begin{array}{ll} \underset{\boldsymbol{w}}{\text{maximize}} & \boldsymbol{w}^\mathsf{T}\boldsymbol{\mu} - \frac{1}{2}\boldsymbol{w}^\mathsf{T}\boldsymbol{\Sigma}\boldsymbol{w}\\ \text{subject to} & \boldsymbol{1}^\mathsf{T}\boldsymbol{w}=1, \quad \boldsymbol{w}\ge\boldsymbol{0} \end{array} $$
We will first prepare the data and then we will solve it via three different methods:
- Using quadprog
- Using CVXPY
- Using skfolio
# Mean-Variance Portfolio Optimization Example
# Problem: maximize mu^T * w - (lambda/2) * w^T * Sigma * w
# subject to: sum(w) = 1, w >= 0
# Equivalently: minimize -mu^T * w + (lambda/2) * w^T * Sigma * w
import numpy as np
from sklearn.model_selection import train_test_split
from skfolio.datasets import load_sp500_dataset
from skfolio.preprocessing import prices_to_returns
# Load and prepare data (using 5 assets for clarity)
prices = load_sp500_dataset()
prices = prices.iloc[:, :5] # Select first 5 assets
X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)
# Calculate mean returns and covariance matrix
mu = X_train.mean().values # Expected returns
Sigma = X_train.cov().values # Covariance matrix
n = len(Sigma) # Number of assets
lmbda = 1.0 # Risk aversion parameter
Mean-Variance Portfolio via quadprog¶
# --- Using quadprog ---
from quadprog import solve_qp
# minimize 1/2 * x^T * G * x - a^T * x
# subject to: C^T * x >= b
G = lmbda * Sigma.copy()
a = mu.copy()
C = np.vstack([np.ones(n), np.eye(n)]).T # [sum=1, w>=0]
b = np.hstack([1, np.zeros(n)])
weights_quadprog = solve_qp(G, a, C, b, meq=1)[0] # meq=1: first constraint is equality
print("Weights (quadprog):")
print(np.round(weights_quadprog, 4))
print(f"Sum: {weights_quadprog.sum():.6f}\n")
Mean-Variance Portfolio via CVXPY¶
# --- Using CVXPY ---
import cvxpy as cp
w = cp.Variable(n)
objective = cp.Minimize(-mu @ w + (lmbda/2) * cp.quad_form(w, Sigma))
constraints = [cp.sum(w) == 1, w >= 0]
problem = cp.Problem(objective, constraints)
problem.solve()
weights_cvxpy = w.value
print("Weights (CVXPY):")
print(np.round(weights_cvxpy, 4))
print(f"Sum: {weights_cvxpy.sum():.6f}\n")
Mean-Variance Portfolio via skfolio¶
# --- Using skfolio ---
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio import RiskMeasure
model = MeanRisk(
objective_function=ObjectiveFunction.MAXIMIZE_UTILITY, # uses mean - λ·risk
risk_measure=RiskMeasure.VARIANCE,
risk_aversion=lmbda / 2.0,
min_weights=0.0, # long-only
budget=1.0 # weights sum to 1
)
model.fit(X_train)
weights_skfolio = model.weights_
print("Weights (skfolio):")
print(np.round(weights_skfolio, 4))
print(f"Sum: {weights_skfolio.sum():.6f}\n")
# Verify all methods give same results
print("\nVerification:")
print(f"quadprog vs CVXPY: {np.allclose(weights_quadprog, weights_cvxpy, atol=1e-3)}")
print(f"CVXPY vs skfolio: {np.allclose(weights_cvxpy, weights_skfolio, atol=1e-3)}")
print(f"quadprog vs skfolio: {np.allclose(weights_quadprog, weights_skfolio, atol=1e-3)}")
Conclusion¶
The solvers available in Python should be sufficient for optimization problems in finance.
Decision Tree for Solver Selection:¶
Is your problem convex?
- YES → Start with CVXPY for rapid prototyping
- NO → Use
scipy.optimize.minimizewith appropriate methods
Need better performance?
- LP: Use
scipy.optimize.linprog - QP: Use
quadprog.solve_qp(small) orosqp(large) - SDP: Ensure SCS or CVXOPT or MOSEK is installed
- LP: Use
Need global optimization?
- Use
scipy.optimize.basinhoppingordifferential_evolution
- Use
Performance Tips:¶
- Provide analytical gradients when possible (2-10x speedup)
- For large problems (n > 1000), use specialized solvers
- Warm-start iterative solvers when solving similar problems repeatedly